Abstract

We develop a general method for power spectrum analysis of three dimensional redshift sur- 
veys. We present rigorous analytical estimates for the statistical uncertainty in the power 
and we are able to derive a rigorous optimal weighting scheme under the reasonable (and 
largely empirically verified) assumption that the long wavelength Fourier components are 
Gaussian distributed. We apply the formalism to the updated 1-in-6 QDOT IRAS redshift
survey, and compare our results to data from other probes: APM angular correlations; 
the CfA and the Berkeley 1.2Jy IRAS redshift surveys. Our results bear out and further 
quantify the impression from e.g. counts-in-cells analysis that there is extra power on large 
scales as compared to the standard CDM model with $\Omega h \simeq 0.5$. We apply likelihood
analysis using the CDM spectrum with $\Omega h$ as a free parameter as a phenomenological family of
models; we find the best fitting parameters in redshift space and transform the results to 
real space. Finally, we calculate the distribution of the estimated long wavelength power. 
This agrees remarkably well with the exponential distribution expected for Gaussian fluc- 
tuations, even out to powers of ten times the mean. Our results thus reveal no trace of 
periodicity or other non-Gaussian behavior. 
1. INTRODUCTION

The IRAS QDOT survey (see e.g. Efstathiou et al. 1990) was designed for the study of 
large-scale structure, and has been analyzed statistically in a number of ways: counts in
cells (Efstathiou et al. 1990), topology (Moore et al. 1992), etc. It has also been used to
predict the peculiar motions of the local group (Rowan-Robinson et al. 1990; Strauss et al. 
1990; Strauss et al. 1992) and of other galaxies (Saunders et al. 1991) and thereby give an 
estimate of the density of the universe. Here we shall present the results of power-spectrum 
analysis of this survey, preliminary results of which have appeared elsewhere (Kaiser 1992; 
Feldman 1993). 

The present study follows the program laid down by Peebles in his pioneering series 
of papers (Yu & Peebles 1969; Peebles 1973; Peebles & Hauser 1974) for statistical analysis 
of galaxy catalogues via low order correlation functions. The power spectrum P(k) is the 
Fourier transform of the spatial two-point function $\xi(r)$ and, as such, can only provide a
partial description of the nature of the large-scale structure. While limited in information, 
the two-point function does hold a rather special place: if the initial fluctuations were 
Gaussian then the power spectrum provides a complete description of the fluctuations. 
It also provides a good starting point for higher-order analyses and, in any case, our 
knowledge of the two-point function is as yet far from complete, particularly on large 
scales. 

While Peebles emphasized the role of the power spectrum as a tool for two-point 
analysis, most work on real data has tended to use the auto-correlation function as the 
primary estimator. Recently, the power spectrum has made something of a comeback and 
has been applied to the CfA survey(s) (Baumgart & Fry 1991; Vogeley et al. 1992), to 
pencil beam surveys (Broadhurst et al. 1990; see also Kaiser & Peacock 1991) and to both 
1-in-6 0.6 Jy (Kaiser et al. 1991) and complete 1.2-Jy surveys (Strauss et al. 1990; Fisher
et al. 1993) derived from Version 2 of the IRAS point source catalogue (Chester, Beichmann 
& Conrow 1987). Power spectrum analysis has also been applied to radio galaxies (Webster 
1977; Peacock & Nicholson 1991), to clusters of galaxies (Peacock & West 1992; Einasto, 
Gramann & Tago 1993) and to the Southern Sky Redshift Survey (Park, Gott & da Costa 
1992). 

Since P(k) and $\xi(r)$ are Fourier transform pairs, one might question what is gained
by all this analysis. While complete knowledge of the two-point function $\xi(r)$ is equivalent
to complete knowledge of the power spectrum P(k), the same is not true of the
estimates $\hat\xi$ and $\hat P$ derived from finite and noisy observational
samples. There may be benefits to be derived from having both estimates and, depending 
on the details of the survey at hand, there will be relative advantages of one method over 
the other. 

Whereas the auto-correlation function measures the excess probability of finding a 
pair of galaxies in two volumes separated by some distance, the power spectrum directly 
measures the fractional density contributions on different scales. This, in general, is a more 
natural quantity, and one that is being supplied by theories that describe the state of the 
Universe at early times e.g. inflationary theories. Furthermore the correlation function, 
especially on large scales, is very sensitive to assumptions we make about the mean density 
n (the correlation function satisfies $\xi_{12} + 1 = \langle n_1 n_2 \rangle / n^2$, where 1, 2 refer to two volumes); in fact,
the scales where $\xi \ll 1$ are the ones we are most interested in when studying large-scale
clustering. In contrast, the power spectrum, which gives us a direct measurement of $\delta\rho/\rho$,
scales as $n^{-1}$ and so determines the density field on large scales more robustly. Uncertainty
in our knowledge of the mean density may amplify the error in the correlation function;
however, the shape of the power spectrum should be unaffected by incomplete knowledge
of n, which is measured by the k = 0 mode.

Another general advantage of the power spectrum derives from the fact that the true 
P(k) is a positive quantity. Thus, in interpreting an estimate $\hat P(k)$, which will not
necessarily be positive, we have important extra information at our disposal. It is therefore
possible to recognize directly unphysical results which may indicate some problem with 
data or analysis program. For the auto-correlation function this information translates 
into an infinite number of integral constraints: the integral of $\xi(\mathbf{r})$ times $\exp(i\mathbf{k}\cdot\mathbf{r})$ must
be positive for every value of k, but this seems relatively obscure. 

A related advantage of the power-spectrum analysis involves the error analysis. In 
general, the statistical uncertainty in any two-point function involves the 4-point function 
which is at best poorly known. An interesting error estimate — and the relevant one if 
one is interested in testing models like CDM — is obtained if we assume that the large- 
scale structure resembles a Gaussian random field. When we take the Fourier transform 
of the galaxies in the survey we are, roughly speaking, transforming the product of the 
infinite random sea of density fluctuations f(r) with a window determined by the survey
geometry. The transform will therefore be the convolution of the Fourier coefficients f(k)
with a 'point-spread function' (psf) which, for the IRAS survey, is a compact ball with
width equal to the inverse of the depth of the survey. For a Gaussian field, the coefficients
f(k) are totally uncorrelated with one another, so, aside from a short-range coherence
imposed by the convolution, estimates of the power will also be statistically independent.
In the correlation function representation the fluctuations in $\hat\xi(r)$ at different r will be
correlated in some rather complicated way.

These advantages are obtained at a price, and that price is particularly high when 
the survey geometry is highly irregular. With a pencil beam survey, for instance, the psf 
is a pancake with transverse dimension 1/w, where w is the width of the needle, and the 
direct estimate of the power at low spatial frequencies will be strongly contaminated by 
'aliasing' from high spatial frequencies (Broadhurst et al. 1990; Kaiser & Peacock 1991). 
The interpretation of the results is then quite difficult and requires careful modeling of
the convolution. In a direct estimate of $\xi(r)$ no such problem arises, although estimating
the statistical error in $\hat\xi$ is still problematic for highly irregular geometries.

In this paper we concentrate on the development of a formalism that is designed 
to construct a descriptive statistic that measures the power spectrum of the underlying 
density fluctuation field assuming that the fluctuation field is some homogeneous and 
statistically isotropic random process. We apply the formalism to surveys with a large 
baseline in all three dimensions, in particular, to the flux limited 1-in-6 IRAS QDOT
survey that covers 73.9% of the sky and has 1824 galaxies with redshifts that correspond
to radii of $20\,h^{-1}$ Mpc $< R < 500\,h^{-1}$ Mpc (where $h = H_0/100$ km s$^{-1}$ Mpc$^{-1}$, as usual)
and galactic latitude $|b| > 10°$. We have used a revised version of the QDOT database in
which a redshift error that afflicted approximately 200 southern galaxies has been corrected 
(courtesy of A. Lawrence). Since the QDOT survey is deep and covers nearly the whole sky 
we feel that in this case, and in general for surveys of that type, power-spectrum analysis 
would seem to be the method of choice. 

The layout of the paper is as follows: In §2 we describe the method; this is essentially
the use of the power spectrum of the weighted galaxy counts, with the weight optimized for
Gaussian fluctuations. A new feature of this work as compared with previous studies is
the rigorous error analysis and optimized weighting scheme. In §3 we apply the method
to the QDOT 1-in-6 survey. The results confirm the impression from e.g. the
counts-in-cells analysis that there is an excess of large-scale power relative to the CDM
predictions (if we normalize to fluctuations at the $\sim 8\,h^{-1}$ Mpc scales) and that the excess
appears as a 'bump' in P(k) around wavelengths $\lambda \sim 100$-$150\,h^{-1}$ Mpc. In §4 we compare
the results with other probes of the large-scale power spectrum, theoretical models and
analysis from other surveys. In §5 we examine the statistical fluctuations in the power and 
compare these with the Gaussian expectation. We conclude in §6. 



2. METHOD

In §2.1 we construct our estimator of the power P(k). In §2.2 we calculate the variance
in $\hat P(k)$ under the assumption that the Fourier transform of the galaxies is approximately
Gaussian distributed. In §2.3 we find the optimum weighting scheme, and in §2.4 we
convert the results to practical formulae to implement on the computer. In §2.5 we present
the covariance matrix.

2.1 The Estimator 

We make the usual assumption that the galaxies form a Poisson sample (Peebles 1980) of
the density field $1 + f(\mathbf{r})$:

$$ P(\text{volume element } \delta V \text{ contains a galaxy}) = \delta V\, n(\mathbf{r})\,[1 + f(\mathbf{r})] \tag{2.1.1} $$

where n(r) is the expected mean space density of galaxies given the angular and luminosity
selection criteria, and we wish to estimate the power spectrum

$$ P(k) \equiv P(\mathbf{k}) = \int d^3r\, \xi(\mathbf{r})\, e^{i\mathbf{k}\cdot\mathbf{r}} \tag{2.1.2} $$

where $\xi(r) = \xi(\mathbf{r}) = \langle f(\mathbf{r}') f(\mathbf{r}' + \mathbf{r}) \rangle$. Our Fourier transform convention is
$f(\mathbf{k}) = \int d^3r\, f(\mathbf{r}) \exp(i\mathbf{k}\cdot\mathbf{r})$, so $f(\mathbf{r}) = (2\pi)^{-3} \int d^3k\, f(\mathbf{k}) \exp(-i\mathbf{k}\cdot\mathbf{r})$. Since we are dealing
with a random field of infinite extent, the numerical value of f(k) is not well defined,
but if we set f(r) = 0 outside of some enormous volume V then
$\langle f(\mathbf{k}) f^*(\mathbf{k}) \rangle / V = \int d^3r\, \langle f(\mathbf{r}') f(\mathbf{r}' + \mathbf{r}) \rangle \exp(i\mathbf{k}\cdot\mathbf{r})$
tends to a well defined limit and it is this quantity that we call P(k).

Our approach is to take the Fourier transform of the real galaxies minus the transform
of a synthetic catalogue with the same angular and radial selection function as the
real galaxies but otherwise without structure. We also incorporate a weight function w(r)
which will be adjusted to optimise the performance of the power-spectrum estimator. We
define the weighted galaxy fluctuation field, with a convenient normalization, to be

$$ F(\mathbf{r}) \equiv \frac{w(\mathbf{r})\,[n_g(\mathbf{r}) - \alpha\, n_s(\mathbf{r})]}{\left[\int d^3r\, n^2(\mathbf{r})\, w^2(\mathbf{r})\right]^{1/2}} \tag{2.1.3} $$

where $n_g(\mathbf{r}) = \sum_i \delta(\mathbf{r} - \mathbf{r}_i)$, with $\mathbf{r}_i$ being the location of the i-th galaxy, and similarly for $n_s$, the
synthetic catalogue, which has number density $1/\alpha$ times that of the real catalogue.

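Taking the Fourier transform of F(r), squaring it and taking the expectation value
we find:

$$ \langle |F(\mathbf{k})|^2 \rangle = \frac{\int d^3r \int d^3r'\, w(\mathbf{r})\, w(\mathbf{r}')\, \langle [n_g(\mathbf{r}) - \alpha n_s(\mathbf{r})][n_g(\mathbf{r}') - \alpha n_s(\mathbf{r}')] \rangle\, e^{i\mathbf{k}\cdot(\mathbf{r} - \mathbf{r}')}}{\int d^3r\, n^2(\mathbf{r})\, w^2(\mathbf{r})} \tag{2.1.4} $$

With the model of equation 2.1.1, the two point functions of $n_g$, $n_s$ are

$$ \begin{aligned}
\langle n_g(\mathbf{r})\, n_g(\mathbf{r}') \rangle &= n(\mathbf{r})\, n(\mathbf{r}')\,[1 + \xi(\mathbf{r} - \mathbf{r}')] + n(\mathbf{r})\, \delta(\mathbf{r} - \mathbf{r}') \\
\langle n_s(\mathbf{r})\, n_s(\mathbf{r}') \rangle &= \alpha^{-2}\, n(\mathbf{r})\, n(\mathbf{r}') + \alpha^{-1}\, n(\mathbf{r})\, \delta(\mathbf{r} - \mathbf{r}') \\
\langle n_g(\mathbf{r})\, n_s(\mathbf{r}') \rangle &= \alpha^{-1}\, n(\mathbf{r})\, n(\mathbf{r}')
\end{aligned} \tag{2.1.5} $$

(see appendix A), so

$$ \langle |F(\mathbf{k})|^2 \rangle = \int \frac{d^3k'}{(2\pi)^3}\, P(\mathbf{k}')\, |G(\mathbf{k} - \mathbf{k}')|^2 + (1 + \alpha)\, \frac{\int d^3r\, n(\mathbf{r})\, w^2(\mathbf{r})}{\int d^3r\, n^2(\mathbf{r})\, w^2(\mathbf{r})} \tag{2.1.6} $$

where

$$ G(\mathbf{k}) \equiv \frac{\int d^3r\, n(\mathbf{r})\, w(\mathbf{r})\, e^{i\mathbf{k}\cdot\mathbf{r}}}{\left[\int d^3r\, n^2(\mathbf{r})\, w^2(\mathbf{r})\right]^{1/2}} \tag{2.1.7} $$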
The content of this result is readily understood. The density field in our data is the true 
infinite density field times some mask, so in Fourier space we obtain a convolution between 
the transforms of the true density field and of the mask. The fair-sample hypothesis as- 
sumes that there are no phase correlations between density and mask, so that the power 
spectrum of the data is the true power spectrum convolved with that of the mask. For 
Poisson-sampled density fields, we obtain in addition a constant shot-noise contribution 
to the power. This arises because the discrete density field has a delta function in its cor- 
relation function. Since £(0) is assumed to be better behaved than this for the underlying 
density field, this discreteness contribution is usually subtracted. 

For the IRAS survey G(k) is a rather compact function with width ~ 1/D, where D
characterizes the depth of the survey. Provided we restrict attention to $|\mathbf{k}| \gg 1/D$, which
is really just the requirement that we have a 'fair sample', and provided P(k) is locally
smooth on the same scale, then

$$ \langle |F(\mathbf{k})|^2 \rangle \simeq P(\mathbf{k}) + P_{\rm shot}, \tag{2.1.8} $$

so the raw power spectrum $|F(\mathbf{k})|^2$ is the true power spectrum plus the constant shot noise
component

$$ P_{\rm shot} \equiv \frac{(1 + \alpha) \int d^3r\, n(\mathbf{r})\, w^2(\mathbf{r})}{\int d^3r\, n^2(\mathbf{r})\, w^2(\mathbf{r})} \tag{2.1.9} $$

Our estimator of P(k) is just

$$ \hat P(\mathbf{k}) = |F(\mathbf{k})|^2 - P_{\rm shot}, \tag{2.1.10} $$

and to obtain our final estimator of P(k) we average $\hat P(\mathbf{k})$ over a shell in k-space

$$ \hat P(k) \equiv \frac{1}{V_k} \int_{V_k} d^3k'\, \hat P(\mathbf{k}'), \tag{2.1.11} $$

where $V_k$ is the volume of the shell.

Equations 2.1.3, 2.1.9-11 provide our operational definition of $\hat P(k)$. To use these
we must specify the weight function w(r) which so far has been arbitrary, and we must
choose some sampling grid in k-space. In order to set these wisely, and also to put error
bars on our estimate of the power, we need to understand the statistical fluctuations in
$\hat P(k)$.
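As a rough numerical illustration of equations 2.1.3 and 2.1.8-2.1.10, the Python sketch below builds F(k) for an unclustered toy catalogue in a periodic cube and compares the mean raw power with the shot-noise level. The uniform density, the unit weight, the box size, the catalogue sizes and the function name `raw_power_uniform_box` are all assumptions made for illustration; none of them is taken from the QDOT analysis.

```python
import numpy as np

def raw_power_uniform_box(n_gal=1000, alpha=0.2, box=100.0, kmax_int=3, seed=0):
    """Mean raw power <|F(k)|^2> and P_shot for an unclustered toy catalogue."""
    rng = np.random.default_rng(seed)
    n_syn = int(round(n_gal / alpha))            # synthetic catalogue is 1/alpha times denser
    r_gal = rng.uniform(0.0, box, size=(n_gal, 3))
    r_syn = rng.uniform(0.0, box, size=(n_syn, 3))
    volume = box**3
    nbar = n_gal / volume                        # constant mean density n(r)
    w = 1.0                                      # constant weight w(r)

    # Wavevectors on the fundamental grid of the cube, excluding k = 0.
    ints = np.arange(-kmax_int, kmax_int + 1)
    grid = np.array([(i, j, m) for i in ints for j in ints for m in ints
                     if (i, j, m) != (0, 0, 0)], dtype=float)
    kvecs = 2.0 * np.pi / box * grid

    # Equation 2.1.3 in Fourier space: the delta functions turn integrals into sums.
    norm = np.sqrt(nbar**2 * w**2 * volume)      # [int d^3r n^2 w^2]^(1/2)
    f_k = (w * np.exp(1j * kvecs @ r_gal.T).sum(axis=1)
           - alpha * w * np.exp(1j * kvecs @ r_syn.T).sum(axis=1)) / norm

    # Equation 2.1.9; for constant n and w it reduces to (1 + alpha) / n.
    p_shot = (1.0 + alpha) * nbar * w**2 * volume / (nbar**2 * w**2 * volume)
    return np.mean(np.abs(f_k)**2), p_shot

mean_raw, p_shot = raw_power_uniform_box()
print(mean_raw / p_shot)   # scatters around 1 for an unclustered sample
```

Because the toy sample has no clustering, subtracting P_shot as in equation 2.1.10 should leave an estimate that is consistent with zero within the mode-to-mode scatter.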

2.2 Statistical Fluctuations in the Power 

From equation (2.1.11) the mean square fluctuation in $\hat P(k)$ is

$$ \sigma_P^2 \equiv \langle (\hat P(k) - P(k))^2 \rangle = \frac{1}{V_k^2} \int_{V_k} d^3k \int_{V_k} d^3k'\, \langle \delta\hat P(\mathbf{k})\, \delta\hat P(\mathbf{k}') \rangle, \tag{2.2.1} $$

which depends on the two point function of $\delta\hat P(\mathbf{k}) = \hat P(\mathbf{k}) - P(\mathbf{k})$. Since $\hat P(\mathbf{k})$ is itself a
two-point function of F(r) this depends on the four-point function which is poorly known.
An interesting model for the two point function of $\delta\hat P(\mathbf{k})$ is to assume that the Fourier
coefficients F(k) are Gaussian distributed, in which case $\langle \delta\hat P(\mathbf{k})\, \delta\hat P(\mathbf{k}') \rangle = |\langle F(\mathbf{k}) F^*(\mathbf{k}') \rangle|^2$
(see appendix B). There are several factors which motivate the Gaussian assumption.
The Fourier transform of F(r) at some low spatial frequency k will receive contributions 
both from real low frequency density fluctuations and from small scale clustering and 
discreteness of galaxies. The latter will tend to produce Gaussian fluctuations by virtue 
of the central limit theorem, and, in the simplest inflationary scenarios at least, the long- 
wavelength fluctuations will also be Gaussian distributed. Perhaps a stronger motivation 
is that the Gaussian hypothesis seems to agree well with the observations; see §5. 
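The single-mode version of the Gaussian result can be checked numerically: if the real and imaginary parts of F(k) are independent Gaussians, the variance of |F(k)|^2 should equal the square of its mean. The sketch below is only an illustration; the function name `power_variance_ratio`, the mean power and the sample size are arbitrary assumptions.

```python
import numpy as np

def power_variance_ratio(mean_power=3.0, n_samples=200_000, seed=1):
    """Draw complex Gaussian F and return (mean, variance) of |F|^2."""
    rng = np.random.default_rng(seed)
    sigma = np.sqrt(mean_power / 2.0)            # per-component rms so that <|F|^2> = mean_power
    f = rng.normal(0.0, sigma, n_samples) + 1j * rng.normal(0.0, sigma, n_samples)
    power = np.abs(f)**2
    return power.mean(), power.var()

mean_p, var_p = power_variance_ratio()
print(mean_p, var_p, var_p / mean_p**2)          # the last ratio should be close to 1
```

The same samples follow an exponential distribution in |F|^2, which is the distribution against which the measured long-wavelength power is compared in §5.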
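We can calculate $\langle F(\mathbf{k}) F^*(\mathbf{k}') \rangle$ by a simple generalisation of the steps leading to
2.1.6, and we find

$$ \langle F(\mathbf{k}) F^*(\mathbf{k}') \rangle = \int \frac{d^3k''}{(2\pi)^3}\, P(\mathbf{k}'')\, G(\mathbf{k} - \mathbf{k}'')\, G^*(\mathbf{k}' - \mathbf{k}'') + S(\mathbf{k}' - \mathbf{k}), \tag{2.2.2} $$

where we have defined

$$ S(\mathbf{k}) \equiv \frac{(1 + \alpha) \int d^3r\, n(\mathbf{r})\, w^2(\mathbf{r})\, e^{i\mathbf{k}\cdot\mathbf{r}}}{\int d^3r\, n^2(\mathbf{r})\, w^2(\mathbf{r})}, \tag{2.2.3} $$

and, in the same approximation that led to equation (2.1.8) we obtain

$$ \langle F(\mathbf{k}) F^*(\mathbf{k} + \delta\mathbf{k}) \rangle \simeq P(\mathbf{k})\, Q(\delta\mathbf{k}) + S(\delta\mathbf{k}) \tag{2.2.4} $$

where

$$ Q(\mathbf{k}) \equiv \frac{\int d^3r\, n^2(\mathbf{r})\, w^2(\mathbf{r})\, e^{i\mathbf{k}\cdot\mathbf{r}}}{\int d^3r\, n^2(\mathbf{r})\, w^2(\mathbf{r})}, \tag{2.2.5} $$

and therefore

$$ \langle \delta\hat P(\mathbf{k})\, \delta\hat P(\mathbf{k} + \delta\mathbf{k}) \rangle = |P(\mathbf{k})\, Q(\delta\mathbf{k}) + S(\delta\mathbf{k})|^2. \tag{2.2.6} $$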



Under the Gaussian assumption, F(k) and $\delta\hat P(\mathbf{k})$ take the form of locally homogeneous
random processes with 2-point functions whose shapes are determined by the survey
geometry. The variance in the power $\langle \delta\hat P(\mathbf{k})^2 \rangle$ is just the square of the total power (i.e.
signal power plus shot noise power), and the two point function of the power is again a
compact function with width ~ 1/D. Thus, the estimator of the power behaves like an
incoherent random field smoothed on scale $\delta k \sim 1/D$. Note the relation between $Q(\delta\mathbf{k})$
and $G(\delta\mathbf{k})$ defined in 2.1.7; these are both measures of the coherence in k space. The fact
that G depends on n and Q on $n^2$ is just the usual difference between the transform of a
field and the transform of its two-point function.

In this regard the power spectrum estimator is very different from the correlation
function estimator. If we have some continuous field f(r) that we view through a survey
'window' of scale D, then as D becomes large the estimator of the two-point function
tends asymptotically to $\xi(r)$. The microscopic fluctuations in $\hat P(\mathbf{k})$, in contrast, remain
of order unity even as $D \to \infty$, but the coherence length for the fluctuations shrinks, so
when we average over some finite volume of frequency space the averaged power tends to
P(k), and the fractional fluctuations in $\hat P(k)$ will be on the order of the inverse square root of the
number of coherence volumes averaged over in equation 2.1.11.



Equations 2.2.1 and 2.2.6 can be used in a number of ways: with $P(k) = \hat P(k)$ we obtain
self-consistent error bars; with $P(k) = P_{\rm CDM}(k)$, for instance, we obtain the expected
fluctuations for this specific theory; and with P(k) = 0 we obtain the 'Poisson' error bars
widely used in the past, though these of course tend to underestimate the real uncertainty.

It should be stressed that the Gaussian model is only an assumption. An alternative
would be to estimate $\langle \delta\hat P(\mathbf{k})\, \delta\hat P(\mathbf{k} + \delta\mathbf{k}) \rangle$ directly from $\hat P(\mathbf{k})$. As well as providing an
empirical error estimate, comparing this directly with equation 2.2.6 would give
an interesting test of the Gaussian nature of the fluctuations. This issue will be discussed
in greater depth in §5.

2.3 Optimum Weighting 

If the shell we average over in equation 2.1.11 has a width which is large compared to the
coherence length then the double integral in 2.2.1 reduces to

$$ \sigma_P^2(k) \simeq \frac{1}{V_k} \int d^3k'\, |P(k)\, Q(\mathbf{k}') + S(\mathbf{k}')|^2, \tag{2.3.1} $$

so, with the definition of Q(k) and S(k) and using Parseval's theorem, the fractional
variance in the power is

$$ \frac{\sigma_P^2(k)}{P^2(k)} = \frac{(2\pi)^3 \int d^3r\, n^4 w^4\, [1 + 1/(n P(k))]^2}{V_k \left[\int d^3r\, n^2 w^2\right]^2} \tag{2.3.2} $$

We seek w(r) which minimises this. Writing $w(\mathbf{r}) = w_0(\mathbf{r}) + \delta w(\mathbf{r})$ and requiring
that $\sigma_P^2(k)$ be stationary with respect to arbitrary variations $\delta w(\mathbf{r})$ we obtain

$$ \frac{\int d^3r\, n^4 w_0^3 \left(\frac{1 + nP}{nP}\right)^2 \delta w(\mathbf{r})}{\int d^3r\, n^4 w_0^4 \left(\frac{1 + nP}{nP}\right)^2} = \frac{\int d^3r\, n^2 w_0\, \delta w(\mathbf{r})}{\int d^3r\, n^2 w_0^2} \tag{2.3.4} $$

and it is easy to see by direct substitution that this is satisfied if we take

$$ w_0(\mathbf{r}) = \frac{1}{1 + n(\mathbf{r})\, P(k)} \tag{2.3.5} $$

This is the optimal weighting (under the assumption that the fluctuations are Gaussian).
It is the analogue of the "$1 + 4\pi n J_3$" weighting scheme often used in correlation analysis,
and, just as in that case, the procedure is somewhat circular since one needs a preliminary
estimate of P(k) in order to set the weighting.
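The optimality of the weight $1/(1 + nP)$ can be illustrated numerically by evaluating a discretised form of the fractional variance (equation 2.3.2) for a few trial weights. The sketch below is a toy under stated assumptions: the exponential selection function, the radial grid, the shell volume and the function names are hypothetical and are not the QDOT fit.

```python
import numpy as np

def optimal_weight(nbar, power):
    """Weight 1 / (1 + n P) for mean density nbar and an assumed power."""
    return 1.0 / (1.0 + nbar * power)

def fractional_variance(nbar, w, power, dvol, shell_volume):
    """Discretised sigma_P^2 / P^2 for a weight array w (thick-shell limit)."""
    numerator = (2.0 * np.pi)**3 * np.sum(
        nbar**4 * w**4 * (1.0 + 1.0 / (nbar * power))**2 * dvol)
    denominator = shell_volume * np.sum(nbar**2 * w**2 * dvol)**2
    return numerator / denominator

# Hypothetical radial selection function on a grid of spherical shells.
r = np.linspace(20.0, 500.0, 481)                 # h^-1 Mpc
dvol = 4.0 * np.pi * r**2 * (r[1] - r[0])         # volume of each radial shell
nbar = 1.0e-2 * np.exp(-r / 80.0)                 # galaxies per (h^-1 Mpc)^3
power = 8000.0                                    # assumed P(k) in (h^-1 Mpc)^3
shell_volume = 1.0e-5                             # V_k; arbitrary for this comparison

trial_weights = {
    "optimal": optimal_weight(nbar, power),
    "equal per galaxy": np.ones_like(r),          # w = 1
    "equal per volume": 1.0 / nbar,               # w = 1/n
}
variances = {name: fractional_variance(nbar, w, power, dvol, shell_volume)
             for name, w in trial_weights.items()}
for name, value in variances.items():
    print(f"{name:18s} {value:.4g}")
```

Substituting the optimal weight into the variance expression gives the closed form $(2\pi)^3 / \{V_k \int d^3r\, [nP/(1+nP)]^2\}$, which the sketch reproduces and which lies below the values for the other two weightings.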
The asymptotic behavior of this weighting scheme is very reasonable: Say we want
to measure density fluctuations on a particular scale $\lambda$. Since n(r) is a rapidly decreasing
function we will have two regimes: For small r, we will have many galaxies per $\lambda^3$ volume,
so the error will be dominated by the finite number of independent 'fluctuation volumes'
and consequently one would like to give equal weight per volume, or equivalently to weight
galaxies in proportion to 1/n. At large radius we are dominated by shot-noise and
consequently one would like to weight galaxies equally. Equation 2.3.5 is in accord with these
notions. Note that the optimal weight depends on the spatial frequency. Insofar as the
power tends to decrease with frequency this means that when we measure long wavelengths
we will be inclined to give greater weight to the more distant galaxies.

An important exception to the above analysis arises when the sampling of the density
field is not Poissonian. This arises in practice when the sky is divided into a number of
zones, and a fixed number of redshifts per zone are measured, independent of the actual
number of galaxies present (e.g. the Campanas redshift survey: Shectman et al. 1992).
Clearly, a gross underestimate of the power spectrum would result if we were simply to
treat the measured redshift data as a Poisson sample. The correct procedure is to weight
the galaxies according to the sampling factor: if a given galaxy is part of the fraction f
sampled from a given field, then we form the density field $n_g(\mathbf{r}) = \sum_i f^{-1}(\mathbf{r}_i)\, \delta(\mathbf{r} - \mathbf{r}_i)$. This
gives an estimate of the density field which is corrected for the variable sampling, and all
that alters is the shot noise term. The first of equations 2.1.5 now becomes

$$ \langle n_g(\mathbf{r})\, n_g(\mathbf{r}') \rangle = n(\mathbf{r})\, n(\mathbf{r}')\,[1 + \xi(\mathbf{r} - \mathbf{r}')] + n(\mathbf{r})\, f^{-1}(\mathbf{r})\, \delta(\mathbf{r} - \mathbf{r}') \tag{2.2.7} $$

so that the shot term in 2.1.6 is greater than it would have been for the same number
of galaxies with constant sampling. Otherwise, the analysis goes through as before. This
may seem a little surprising, given that the sampling factors are not imposed in advance,
but 'know' about the large-scale properties of the density field. However, study of the
derivation of equation 2.1.5 given in appendix A shows that there is no problem. The
critical term is the evaluation of $\langle n_i n_j \rangle$ for two different cells. Even if the mean density
of objects has been adjusted to reflect the average density perturbation over some region,
galaxies are still distributed at random within that region, which is all that is required in
order to write $\langle n_i n_j \rangle$ in terms of $\xi(\mathbf{r}_i - \mathbf{r}_j)$.
2.4 A Practical Algorithm.

We now summarize the results, and write all the integrals in terms of discrete sums as will 
be evaluated on the computer. Let us assume that we are provided with the coordinates 
for the real and synthetic catalogues plus a function which returns the value of n(r) (the 
details of how the synthetic catalogue was actually constructed for the IRAS survey are 
given below). Alternatively, the local value of n may be provided along with the coordinates
of the synthetic galaxies. 

We need to evaluate F(k) and also the auxiliary functions Q(k) and S(k). We
know that these will be smooth on scale $\delta k \sim 1/D$, so we evaluate these on a cartesian
grid. We would like to emphasize that the geometry of the survey requires a cartesian
grid, and so we sample k-space linearly and present the results as linear-log plots rather
than the traditional log-log plots. A reasonable value for the grid spacing for the IRAS
survey is $\delta k = 0.02\,h\,{\rm Mpc}^{-1}$. Since the functions Q(k) and S(k) will be rather compact we
need only evaluate these on a fairly small grid. A convenient way to perform the spatial
integrals is to use $\int d^3r\, n(\mathbf{r}) \ldots \to \alpha \sum_s \ldots$, where the sum is over the synthetic galaxies
which we assume are sufficiently numerous to define n. In order to set the weight scheme
(equation 2.3.5) we need to assume some value for P(k). The approach we have taken
is to use a range of values for P(k), each of which provides a legitimate though not
necessarily optimal estimate, and then one can select, for any range of wavenumber, the
appropriate optimal estimator. Having chosen a value for P(k) it is convenient to adjust
the normalization of the weight function so that

$$ \int d^3r\, n^2(\mathbf{r})\, w^2(\mathbf{r}) \to \alpha \sum_s n(\mathbf{r}_s)\, w^2(\mathbf{r}_s) = 1. \tag{2.4.1} $$

We then evaluate

$$ F(\mathbf{k}) = \int d^3r\, w(\mathbf{r})\,[n_g(\mathbf{r}) - \alpha n_s(\mathbf{r})]\, e^{i\mathbf{k}\cdot\mathbf{r}} \to \sum_g w(\mathbf{r}_g)\, e^{i\mathbf{k}\cdot\mathbf{r}_g} - \alpha \sum_s w(\mathbf{r}_s)\, e^{i\mathbf{k}\cdot\mathbf{r}_s}, \tag{2.4.2} $$

$$ Q(\mathbf{k}) = \int d^3r\, n^2(\mathbf{r})\, w^2(\mathbf{r})\, e^{i\mathbf{k}\cdot\mathbf{r}} \to \alpha \sum_s n(\mathbf{r}_s)\, w^2(\mathbf{r}_s)\, e^{i\mathbf{k}\cdot\mathbf{r}_s} \tag{2.4.3} $$

and

$$ S(\mathbf{k}) = (1 + \alpha) \int d^3r\, n(\mathbf{r})\, w^2(\mathbf{r})\, e^{i\mathbf{k}\cdot\mathbf{r}} \to \alpha (1 + \alpha) \sum_s w^2(\mathbf{r}_s)\, e^{i\mathbf{k}\cdot\mathbf{r}_s}. \tag{2.4.4} $$
Our radially averaged power spectrum estimator is

$$ \hat P(k) = \frac{1}{N_k} \sum_{k < |\mathbf{k}'| < k + \delta k} |F(\mathbf{k}')|^2 - S(0) \tag{2.4.5} $$

where $N_k$ is the number of modes in the shell.
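The discrete sums of equations 2.4.1-2.4.5 can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for the QDOT analysis: the function names `fkp_mode_power` and `shell_average`, the uniform toy catalogues, the box size and the shell edges are all assumptions, and the sums are evaluated directly, which is only practical for small catalogues.

```python
import numpy as np

def fkp_mode_power(r_gal, n_gal, r_syn, n_syn, alpha, p_weight, kvecs):
    """Per-mode estimate |F(k)|^2 - S(0).

    r_gal, r_syn : (N, 3) positions of the real and synthetic galaxies
    n_gal, n_syn : mean density n(r) evaluated at those positions
    p_weight     : power assumed in the weight w = 1 / (1 + n P)
    """
    w_gal = 1.0 / (1.0 + n_gal * p_weight)
    w_syn = 1.0 / (1.0 + n_syn * p_weight)
    # Equation 2.4.1: rescale w so that alpha * sum_s n(r_s) w^2(r_s) = 1.
    norm = np.sqrt(alpha * np.sum(n_syn * w_syn**2))
    w_gal, w_syn = w_gal / norm, w_syn / norm
    # Equation 2.4.2: direct sums over the two catalogues.
    f_k = ((w_gal * np.exp(1j * kvecs @ r_gal.T)).sum(axis=1)
           - alpha * (w_syn * np.exp(1j * kvecs @ r_syn.T)).sum(axis=1))
    # Equation 2.4.4 evaluated at k = 0 gives the shot-noise level S(0).
    shot = alpha * (1.0 + alpha) * np.sum(w_syn**2)
    return np.abs(f_k)**2 - shot, shot

def shell_average(p_modes, kvecs, k_edges):
    """Equation 2.4.5: average the per-mode power in shells of |k|."""
    kmag = np.linalg.norm(kvecs, axis=1)
    which = np.digitize(kmag, k_edges) - 1        # shell index of every mode
    n_shells = len(k_edges) - 1
    p_shell = np.full(n_shells, np.nan)
    counts = np.zeros(n_shells, dtype=int)
    for i in range(n_shells):
        selected = which == i
        counts[i] = selected.sum()
        if counts[i] > 0:
            p_shell[i] = p_modes[selected].mean()
    return p_shell, counts

# Toy usage: unclustered points in a cube, so the estimate should scatter around zero.
rng = np.random.default_rng(2)
box, n_real, alpha = 100.0, 1000, 0.2
nbar = n_real / box**3
r_gal = rng.uniform(0.0, box, size=(n_real, 3))
r_syn = rng.uniform(0.0, box, size=(int(round(n_real / alpha)), 3))
ints = np.arange(-3, 4)
grid = np.array([(i, j, m) for i in ints for j in ints for m in ints
                 if (i, j, m) != (0, 0, 0)], dtype=float)
kvecs = 2.0 * np.pi / box * grid
p_modes, shot = fkp_mode_power(r_gal, np.full(len(r_gal), nbar),
                               r_syn, np.full(len(r_syn), nbar),
                               alpha, 500.0, kvecs)
p_shell, counts = shell_average(p_modes, kvecs, np.array([0.05, 0.15, 0.25, 0.35]))
print(shot, p_shell, counts)
```

For a real survey the densities `n_gal` and `n_syn` would come from the fitted selection function, and the wavevectors would be laid out on the cartesian grid described above.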

In §2.3 we obtained the rather simple expressions 2.3.1, 2.3.2 for the variance in
the power. These say that the rms fractional fluctuation in the power is just the inverse square
root of the number of 'coherence volumes' in k-space: $\sigma_P(k)/P(k) = \sqrt{V_c/V_k}$, where the
coherence volume is defined by $V_c \equiv (2\pi)^3 \int d^3r\, n^4 w^4\, [1 + 1/(nP)]^2 / \left[\int d^3r\, n^2 w^2\right]^2$. However, these require
that the shell be thick compared to the coherence length. This is fine at high frequencies
$k \gg 1/D$, but at low frequency this would result in a loss of resolution. Instead, we use
the analogue of 2.2.1

$$ \sigma_P^2(k) = \frac{2}{N_k^2} \sum_{\mathbf{k}'} \sum_{\mathbf{k}''} |P\, Q(\mathbf{k}' - \mathbf{k}'') + S(\mathbf{k}' - \mathbf{k}'')|^2 \tag{2.4.6} $$

where k' and k'' are constrained to lie in the shell and which is valid for any shell thickness.
If we make the shells thinner than or comparable to the coherence length then neighboring
shells will be correlated, but the variance tends to a well defined finite value even as the
shell tends to zero width.
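The double sum over pairs of modes in a shell can be written compactly with array broadcasting. The Python sketch below is illustrative only: the function name `shell_variance`, the toy catalogue and the assumed power are hypothetical, and the prefactor $2/N_k^2$ follows our reading of equation 2.4.6, with Q and S evaluated from the synthetic catalogue as in equations 2.4.3 and 2.4.4.

```python
import numpy as np

def shell_variance(kshell, r_syn, n_syn, w_syn, alpha, power):
    """Variance of the shell-averaged power from the double sum over mode pairs.

    kshell : (N_k, 3) wavevectors in the shell
    w_syn  : weights at the synthetic galaxies, already normalised so that
             alpha * sum_s n(r_s) w^2(r_s) = 1
    """
    dk = kshell[:, None, :] - kshell[None, :, :]            # all differences k' - k''
    phases = np.exp(1j * dk.reshape(-1, 3) @ r_syn.T)       # shape (N_k^2, N_s)
    q = alpha * (n_syn * w_syn**2 * phases).sum(axis=1)     # Q(k' - k'')
    s = alpha * (1.0 + alpha) * (w_syn**2 * phases).sum(axis=1)  # S(k' - k'')
    n_k = len(kshell)
    return 2.0 / n_k**2 * np.sum(np.abs(power * q + s)**2)

# Toy usage: a uniform cube, for which Q and S are sharply peaked at zero lag.
rng = np.random.default_rng(3)
box, alpha, n_s = 100.0, 0.5, 2000
r_syn = rng.uniform(0.0, box, size=(n_s, 3))
n_syn = np.full(n_s, 1.0e-3)                                # constant n(r)
w_syn = np.full(n_s, 1.0 / np.sqrt(alpha * np.sum(n_syn)))  # satisfies the normalisation
ints = (-1, 0, 1)
kshell = 2.0 * np.pi / box * np.array(
    [(i, j, m) for i in ints for j in ints for m in ints if (i, j, m) != (0, 0, 0)],
    dtype=float)
power = 4000.0
variance = shell_variance(kshell, r_syn, n_syn, w_syn, alpha, power)
print(np.sqrt(variance))
```

In this toy geometry the modes are nearly independent, so the result is close to $2\,[P + S(0)]^2/N_k$; for a real survey window the off-diagonal terms in the double sum carry the coherence between neighboring modes.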
2.5 Covariance Matrix

The error estimate 2.4.6 is adequate to give an estimate of the variance in the power 
spectrum, but for a more precise assessment of theoretical models it is necessary to quantify 
the degree of correlation of P(k) at differing wave numbers. The way to do this is via 
likelihood analysis. 

Provided the fractional error is moderately small (i.e. the shell intercepts a sufficiently
large number of coherence volumes), the fluctuations in the power will themselves
tend to become Gaussian distributed, and the vector of estimates $\hat P_i$ together with the
correlation matrix allow one to evaluate the likelihood for any particular theory:

$$ L[P_{\rm th}(k)] = p[\hat P_i \,|\, P_{\rm th}(k)] = \frac{\exp\left\{-C^{-1}_{ij}\, [\hat P_i - P_{\rm th}(k_i)]\, [\hat P_j - P_{\rm th}(k_j)] / 2\right\}}{(2\pi)^{N/2}\, |C|^{1/2}}. \tag{2.5.1} $$

This provides a quantitative way to compare theories. 
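A multivariate Gaussian likelihood of this kind is usually evaluated in logarithmic form. The sketch below is a generic illustration, not the authors' code; the function name `gaussian_log_likelihood` and the two-bin example values are assumptions.

```python
import numpy as np

def gaussian_log_likelihood(p_est, p_theory, cov):
    """ln L for binned power estimates with covariance matrix cov."""
    resid = np.asarray(p_est, dtype=float) - np.asarray(p_theory, dtype=float)
    sign, logdet = np.linalg.slogdet(cov)         # stable log-determinant of C
    if sign <= 0:
        raise ValueError("covariance matrix must be positive definite")
    chi2 = resid @ np.linalg.solve(cov, resid)    # resid^T C^-1 resid without inverting C
    n = resid.size
    return -0.5 * (chi2 + logdet + n * np.log(2.0 * np.pi))

# Hypothetical two-bin example with correlated errors.
cov = np.array([[4.0, 1.0],
                [1.0, 9.0]])
print(gaussian_log_likelihood([10.0, 20.0], [10.0, 20.0], cov))
print(gaussian_log_likelihood([10.0, 20.0], [12.0, 17.0], cov))
```

Since the covariance matrix depends on the theory being tested, the log-determinant term cannot be dropped when comparing different model spectra.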

This is rather similar to other applications of likelihood analysis to cosmological
data sets (Kaiser 1992, Bond & Efstathiou 1984), but there is a slight difference here in
that the correlation matrix for the binned estimates of P actually depends on P(k) itself:

$$ C_{ij} \equiv \langle \delta\hat P(k_i)\, \delta\hat P(k_j) \rangle = \frac{2}{N_k N_{k'}} \sum_{\mathbf{k}} \sum_{\mathbf{k}'} |P\, Q(\mathbf{k} - \mathbf{k}') + S(\mathbf{k} - \mathbf{k}')|^2 \tag{2.5.2} $$

where k and k' lie in the shells around $k_i$ and $k_j$ respectively, which contain $N_k$ and $N_{k'}$ modes.

Equation 2.5.2 is derived, like equation 2.2.6, under the assumption that P(k) is
effectively constant over a coherence scale. With this assumption we can write

$$ C_{ij} = C^{(0)}_{ij} + C^{(1)}_{ij} \sqrt{P_{\rm th}(k_i)\, P_{\rm th}(k_j)} + C^{(2)}_{ij}\, P_{\rm th}(k_i)\, P_{\rm th}(k_j) \tag{2.5.3} $$

The use of this will be illustrated in §4. 
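One way to read the decomposition of the covariance into theory-independent pieces is to expand $|PQ + S|^2$ into terms of order $P^0$, $P^1$ and $P^2$ and to replace P by $\sqrt{P_{\rm th}(k_i) P_{\rm th}(k_j)}$ for a pair of bins. The sketch below assumes that reading; the function names, the array layout and the random stand-in values for Q and S are hypothetical and serve only to check the algebra.

```python
import numpy as np

def coefficient_matrices(q_pairs, s_pairs, prefactor):
    """Geometry-only coefficient matrices.

    q_pairs, s_pairs : complex arrays of shape (n_bins, n_bins, n_pairs) holding
                       Q(k - k') and S(k - k') for every pair of modes drawn
                       from bins i and j
    prefactor        : normalisation of the double sum, scalar or (n_bins, n_bins)
    """
    c0 = prefactor * np.sum(np.abs(s_pairs)**2, axis=-1)
    c1 = prefactor * np.sum(2.0 * np.real(q_pairs * np.conj(s_pairs)), axis=-1)
    c2 = prefactor * np.sum(np.abs(q_pairs)**2, axis=-1)
    return c0, c1, c2

def covariance(p_theory, c0, c1, c2):
    """Assemble C_ij for a model spectrum sampled at the bin centres."""
    pp = np.outer(p_theory, p_theory)             # P_th(k_i) P_th(k_j)
    return c0 + c1 * np.sqrt(pp) + c2 * pp

# Check the expansion against direct evaluation with random stand-in values.
rng = np.random.default_rng(4)
shape = (3, 3, 50)
q_pairs = rng.normal(size=shape) + 1j * rng.normal(size=shape)
s_pairs = rng.normal(size=shape) + 1j * rng.normal(size=shape)
p_theory = np.array([8000.0, 5000.0, 2000.0])
prefactor = 2.0 / 50.0
c0, c1, c2 = coefficient_matrices(q_pairs, s_pairs, prefactor)
cov_fast = covariance(p_theory, c0, c1, c2)
root_pp = np.sqrt(np.outer(p_theory, p_theory))[:, :, None]
cov_direct = prefactor * np.sum(np.abs(root_pp * q_pairs + s_pairs)**2, axis=-1)
print(np.allclose(cov_fast, cov_direct))
```

The practical benefit is that the three coefficient matrices are computed once from the survey geometry and weights, after which the covariance for any trial spectrum in a likelihood search costs only a few array operations.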
3. THE QDOT SURVEY



3.1 The Survey. 

We apply the analysis to the updated (A. Lawrence 1993, private communication) IRAS
QDOT survey (Efstathiou et al. 1990). The QDOT survey chooses at random one in six
galaxies from the IRAS point source catalogue with a flux limit at 60 μm of $f_{60} > 0.6$ Jy. In
this sample there are 1824 galaxies at galactic latitudes $|b| > 10°$ with redshifts that
correspond to radii $20\,h^{-1}$ Mpc $< R < 500\,h^{-1}$ Mpc. We have used a revised version of
the QDOT database in which a redshift error that afflicted approximately 200 southern
galaxies has been corrected. We have converted all redshifts to the local group frame.

The survey provides us with an angular mask. The sky is divided into 41167 bins,
each ≈ 1° × 1°, some of which are masked. Since most of the bins at galactic latitudes
$|b| < 10°$ are masked because of obscuration due to the galactic plane, we masked off all
bins that had $|b| < 10°$. There are 30444 unmasked bins, a coverage of ≈ 74% of the
sky. The QDOT galaxy distribution and mask are shown in figure 1. A more complete
description of the sample is given in Efstathiou et al. (1990).

To apply the above formalism to the QDOT survey, we wrote two independent codes 
that utilized different grids in k-space and different forms for the radial selection function.
We obtained very good agreement between the results for all values of P(k) in the weight 
function (Eqn. 2.3.5). 
3.2 Construction of the Synthetic Catalogue

Once we have the radial galaxy distribution from the survey, we bin the galaxies in bins
of width dr and then fit, using a $\chi^2$ technique, a number function

$$ n(r) = A \left(\frac{r}{r_{\max}}\right)^{\ldots} \Big/ \left[1 + \ldots\right] \tag{3.2.1} $$

where

$$ A = 2\, n_{\max}, \tag{3.2.2} $$

$n_{\max} = n(r = r_{\max})$ and $r_{\max}$ is the radius where n(r) peaks (see figure 2 for the radial
distribution of the IRAS QDOT galaxies and a best fit). We integrated n(r) and divided
the function into $n_n$ bins, each of which has the same number of galaxies. To check the
dependence on the exact choice of the radial selection function, we also tried log-normal
bins and varied the parameters in the fitting function Eqn. 3.2.1 to give us a $1\sigma$ effect.
The effect of these choices on the final results is negligible. We constructed the synthetic
catalogue by distributing galaxies with a Poisson distribution in all unmasked bins given
the above selection function.
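Drawing synthetic galaxies from a fitted radial distribution amounts to inverting its cumulative integral, which is the continuous analogue of dividing the integrated n(r) into bins that each hold the same number of galaxies. The Python sketch below illustrates this under loud assumptions: the radial shape used is a hypothetical stand-in that merely peaks at `r_max` (it is not the fitted function of Eqn. 3.2.1), a simple cut at $|b| = 10°$ replaces the real angular mask, a fixed number of points is drawn instead of Poisson counts per bin, and the function name `draw_synthetic_catalogue` is invented for the example.

```python
import numpy as np

def draw_synthetic_catalogue(r_grid, dn_dr, n_points, b_min_deg=10.0, seed=0):
    """Draw positions with radial distribution dn_dr and latitude |b| > b_min_deg."""
    rng = np.random.default_rng(seed)
    # Cumulative radial distribution by trapezoidal integration.
    steps = 0.5 * (dn_dr[1:] + dn_dr[:-1]) * np.diff(r_grid)
    cdf = np.concatenate(([0.0], np.cumsum(steps)))
    cdf /= cdf[-1]
    # Inverse-transform sampling: uniform deviates mapped through the inverse CDF.
    radii = np.interp(rng.uniform(size=n_points), cdf, r_grid)
    # Isotropic directions outside the latitude cut: sin(b) is uniform in the allowed range.
    sin_b_min = np.sin(np.radians(b_min_deg))
    sin_b = rng.uniform(sin_b_min, 1.0, size=n_points) * rng.choice([-1.0, 1.0], size=n_points)
    cos_b = np.sqrt(1.0 - sin_b**2)
    lon = rng.uniform(0.0, 2.0 * np.pi, size=n_points)
    unit = np.column_stack((cos_b * np.cos(lon), cos_b * np.sin(lon), sin_b))
    return radii[:, None] * unit

# Hypothetical radial shape peaking at r_max (illustration only).
r_max = 60.0
r_grid = np.linspace(20.0, 500.0, 961)            # h^-1 Mpc
x = r_grid / r_max
dn_dr = 2.0 * x / (1.0 + x**2)
xyz = draw_synthetic_catalogue(r_grid, dn_dr, n_points=20000)
print(xyz.shape)
```

With a real mask, the latitude cut would be replaced by a lookup of each direction in the table of unmasked sky bins.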



3.3 The Power spectrum. 

We have seen that the optimum weight function depends on k (Eqn. 2.3.5), so that
a different weighting should in principle be used at each wavenumber. This would be
rather cumbersome, and so we have chosen in practice to evaluate the weight assuming
a given constant level of power (i.e. an n = 0 white noise power spectrum). By varying
this assumed power over values that cover the observed power, it is easy to see what
effect the exact choice of weight would have. We shall consider four different assumed
power levels, which produce the different effective survey depths shown in figure 3. In
figure 4 we show the power spectrum results for the full IRAS QDOT survey using these
different weight function parameterizations. We see that the larger the power in the weight
function is (i.e. the greater the effective depth of the survey) the more power we get. The
effect is most marked at the shallow end: we gain roughly a factor 1.5 in power when
the assumed power changes from $2000\,(h^{-1}\,{\rm Mpc})^3$ to $4000\,(h^{-1}\,{\rm Mpc})^3$, but things change
relatively little thereafter. This suggests that the effect is a local one that represents true
sampling fluctuations, but that the correct average power is detected for greater depths.
We shall generally adopt the results with $P = 8000\,(h^{-1}\,{\rm Mpc})^3$ as our 'standard' set of
values, since this power is closest to the average level in the k range of interest.

We attempted to choose different selection criteria that might help us disentangle 
clustering effects that arise because the more distant galaxies are intrinsically more lumi- 
nous than those which are nearby. We divided the data into six samples with different flux 
limits, and for each we chose a number of weight functions by changing the power P(k) 
in w(r). We found no systematic effects that appeared to depend primarily on luminosity, 
rather than distance. 

The most interesting feature of the power-spectrum results is that the function is 
relatively smooth, with no significant sharp features. As is discussed in more detail below, 
there is an excess at 0.03 < k < 0.07 when comparing to the CDM power spectrum with
$\Omega h = 0.5$, normalized around k = 0.1. This occurs for all weight functions we used as
well as for most sub-samples. At the largest wavelengths, our data indicate a turnover in
the power spectrum, with reduced power for $k \lesssim 0.04$ corresponding to $\lambda \gtrsim 150\,h^{-1}\,{\rm Mpc}$.
Although the error bars are large, this also is a feature which appears to be robust with 
respect to changing the depth of the survey. There is the worry that the turnover may be 
a normalization effect: our power spectra must vanish at k = 0 since we obtain the mean
density from the data. Whether this is a problem depends on the convolution function G(k) 
(Eqn. 2.1.7), and is an issue discussed by Peacock & Nicholson (1991). If we approximate 
the function by a Gaussian $|G(k)|^2 = \exp(-k^2R^2)$, then the power-spectrum estimate is
sensitive to normalization effects only for $k \lesssim 1.5/R$. The appropriate values of R that fit
our G function vary from $R = 65\,h^{-1}\,{\rm Mpc}$ for $P = 2000\ (h^{-1}\,{\rm Mpc})^3$ to $R = 105\,h^{-1}\,{\rm Mpc}$
for $P = 16000\ (h^{-1}\,{\rm Mpc})^3$. The turnover in the QDOT power spectrum is thus not an
artifact of self-normalization. Furthermore, the location of the turnover agrees well with 
the position of the same feature as seen in other data sets (e.g. Peacock & West 1992). 

4. DISCUSSION OF POWER-SPECTRUM RESULTS 

In §4.1 we compare our results to other probes and in §4.2 we present theoretical model 
power spectra and compare them to the QDOT results. In §4.3 we present the results 
of CDM likelihood analysis in redshift space and in §4.4 we transform the results to real 
space. 

4.1 Comparison With Other Probes

In figure 5 we show a comparison between the shallow [$P(k) = 2000\ (h^{-1}\,{\rm Mpc})^3$] IRAS
QDOT power spectrum and the 1.2-Jy Berkeley survey (Fisher et al. 1993). The error bars
on the 1.2-Jy survey are derived from the standard deviation of 10 'mock' IRAS standard
CDM simulations (see Fisher et al. 1993 for details). The analysis method used on the
1.2-Jy data differs somewhat from that used here; they weighted volumes equally within
some cylindrical volume of varying size, up to a maximum of length $180\,h^{-1}\,{\rm Mpc}$ by radius
$90\,h^{-1}\,{\rm Mpc}$. This is equivalent in volume to a sphere of radius $103\,h^{-1}\,{\rm Mpc}$. Comparing
with figure 3, we see that our shallowest weight does give roughly constant weighting to 
volumes out to this radius, which is why we have presented the comparison in this way. 
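
The equivalence between the cylinder and the sphere quoted above is a simple volume calculation. A minimal sketch of the arithmetic follows; the helper name `equivalent_sphere_radius` is introduced here purely for illustration and all lengths are taken to be in $h^{-1}\,{\rm Mpc}$.

```python
import math


def equivalent_sphere_radius(length, radius):
    """Radius of the sphere whose volume equals that of a cylinder.

    Illustrative helper (not from the paper); lengths in h^-1 Mpc.
    """
    cylinder_volume = math.pi * radius**2 * length
    # Invert V = (4/3) * pi * R^3 for R.
    return (3.0 * cylinder_volume / (4.0 * math.pi)) ** (1.0 / 3.0)


# Largest cylinder quoted for the 1.2-Jy analysis: length 180, radius 90.
r_eq = equivalent_sphere_radius(length=180.0, radius=90.0)
print(f"equivalent sphere radius: {r_eq:.1f} h^-1 Mpc")
```

The result is close to 103, in line with the radius quoted in the text.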

Once scaled to the same depth, the surveys agree reasonably well. The 1.2-Jy
data lie slightly below our shallow results; however, there would be a much more marked
discrepancy if we had chosen to compare to the 'standard' $P = 8000\ (h^{-1}\,{\rm Mpc})^3$ QDOT
results: about a factor of 1.5 in power. Also, because of their choice of sampling in
k-space, the 1.2-Jy resolution is not very high for large scales (small k), which perhaps
led them to miss some of the structure of the power at scales $\sim 100$-$150\,h^{-1}\,{\rm Mpc}$. As
mentioned above, the survey geometry requires linear sampling of k-space, and so linear
plotting of results.

In figure 6 we show a comparison with data from the CfA surveys (Vogeley et al. 
1992). The power spectra are of galaxies from the CfA 1 and the CfA 2 surveys, divided 
into two categories: CfA 100 and the deeper CfA 145 (for details see Vogeley et al. 1992). 
The error bars are derived from estimating the 95% confidence level from the variation of 
100 'mock' CfA surveys from open CDM simulations. As with QDOT, the CfA results 
appear to show some weak trend for the power to increase with increasing sample depth. 
The shapes of the CfA and QDOT power spectra appear to differ: similar levels of power 
are seen at large wavelength, but the CfA spectra show much more small-scale power. 
However, note the relative sizes of the large-wavelength error bars in the CfA data, plus
the fact that their analysis gives no idea of the extent of any cross-correlation between 
different points. At present, it is probably not possible to rule out with great confidence 
the hypothesis that CfA and QDOT power spectra have the same shape, but an amplitude 
different by a factor of about 2. Improved data will be interesting here, as a difference 
in the shapes of the power spectra for optically-selected and IRAS galaxies must hold 
information about the processes which created the differing spatial distributions for these
objects. 



4.2 Fitting Model Power Spectra 

It is interesting to fit the power-spectrum data by some analytical model. Since the result 
appears to be a relatively smooth function, it is convenient to use for this purpose the 
CDM linear power spectrum. This allows for parameterization of the slope and amount 
of curvature, depending on the density. Initially, we shall use this simply as an empirical 
way of describing the data; the physical interpretation of the results will be given later. 

The CDM power spectrum takes the form of a scale-invariant spectrum modified by
some transfer function: $P(k) \propto k\,T_k^2$. We shall use the BBKS approximation (Bardeen
et al. 1986), which is the most accurate fitting formula:

$$T_k = \frac{\ln(1 + 2.34q)}{2.34q}\left[1 + 3.89q + (16.1q)^2 + (5.46q)^3 + (6.71q)^4\right]^{-1/4}, \qquad (4.2.1)$$

where $q = k/[\Omega h^2\,{\rm Mpc}^{-1}]$. Since observable wavenumbers are in units of $h\,{\rm Mpc}^{-1}$, the
shape parameter is the apparent value of $\Omega h$. [This should not be confused with the
parameter $\Gamma$ defined by Efstathiou, Bond & White (1992); they considered a transfer
function which fits a CDM model with non-zero baryon density $\Omega_B = 0.03$. Empirically, the
wavenumber scaling in the CDM model depends very nearly on $\Omega h^2 \exp[-2\Omega_B]$; Efstathiou
et al. defined $\Gamma = 0.5$ to correspond to $\Omega h = 0.5$, and our values of $\Omega h$ are thus smaller
than $\Gamma$ by a factor of 0.94.] The normalization of the power spectrum can be expressed
in various ways (e.g. Peacock 1991); here we shall use $\sigma_8$, the linear rms in spheres of
radius $8\,h^{-1}\,{\rm Mpc}$.
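
For readers who wish to evaluate the model numerically, the following is a minimal sketch of the BBKS transfer function, taking equation (4.2.1) in its standard form $T_k = \ln(1+2.34q)/(2.34q)\,[1 + 3.89q + (16.1q)^2 + (5.46q)^3 + (6.71q)^4]^{-1/4}$. The function names are illustrative assumptions. Wavenumbers are assumed to be in $h\,{\rm Mpc}^{-1}$, so that $q = k/(\Omega h)$, and the power spectrum is returned without any $\sigma_8$ normalization.

```python
import numpy as np


def bbks_transfer(k, omega_h):
    """BBKS transfer function T_k for wavenumber k in h Mpc^-1.

    omega_h is the apparent shape parameter (Omega * h); with k in
    h Mpc^-1 the scaled wavenumber is q = k / omega_h.
    """
    q = np.asarray(k, dtype=float) / omega_h
    poly = 1.0 + 3.89 * q + (16.1 * q) ** 2 + (5.46 * q) ** 3 + (6.71 * q) ** 4
    # log1p keeps the small-q limit (T -> 1) numerically accurate.
    return np.log1p(2.34 * q) / (2.34 * q) * poly ** -0.25


def cdm_power_shape(k, omega_h):
    """Unnormalized scale-invariant CDM spectrum, P(k) proportional to k T_k^2."""
    k = np.asarray(k, dtype=float)
    return k * bbks_transfer(k, omega_h) ** 2


k = np.logspace(-2.0, -0.7, 5)  # roughly 0.01 to 0.2 h Mpc^-1
for omega_h in (0.5, 0.25):
    print(omega_h, cdm_power_shape(k, omega_h))
```

Lowering $\Omega h$ moves the bend in the spectrum to smaller k, which is how a single parameter controls the relative amount of large-scale power once the curves are normalized at a common wavenumber.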

In figure 7 we show a comparison with a 'standard' linear CDM power spectrum 
having $\Omega h = 0.5$. The poor fit is apparent, although whether one regards this as the data
having an excess of large-scale power or a lack of small-scale power is a matter both of 
taste and of where we choose to normalize. We further compare the results to the APM 
fitting formula of Peacock (1991): 



$$\Delta^2(k) = \frac{(k/k_0)^4}{1 + (k/k_c)^{2.4}}. \qquad (4.2.2)$$

A reasonable fit to the IRAS QDOT power spectrum is achieved with $k_0 = 0.03\,h\,{\rm Mpc}^{-1}$
and $k_c \in [0.025, 0.04]\,h\,{\rm Mpc}^{-1}$. The overall shape and position of the break agree reasonably
well with the APM data; however, note that $k_c \simeq 0.015\,h\,{\rm Mpc}^{-1}$ is required to best-fit
the APM data, and such a large break wavelength appears to be excluded by our data.

We also compare our results with simulations of mixed [cold (70%) plus hot (30%)]
dark matter (MDM) (Klypin et al. 1992). The 'red galaxies' power spectrum in the MDM
simulations is that of all the dark matter halos. The halos were defined to be at the
maxima of the overdense regions with overdensity > 50. The 'galaxies' were displaced
along the line of sight in accordance with their peculiar velocities to mimic redshift space,
and the density field was smoothed with a Gaussian filter of radius half a cell size to reduce
shot noise (Klypin 1993, private communication). The overall shape of the power of the
red galaxies in the MDM simulations agrees quite well with our power spectrum. As we
shall see below, this is because the MDM scenario gives results that are rather similar to
those of CDM models with low $\Omega h$ or 'tilted' spectra.

Rather than looking at a priori models any further, we shall now proceed to fit linear 
CDM power spectra to the QDOT redshift-space power-spectrum data. We emphasize 
that the resulting values of shape ($\Omega h$) and normalization ($\sigma_8$) are apparent redshift-space
values only. Before they can be interpreted, we shall need to consider the effects
of nonlinearities and distortions caused by the mapping between real space and redshift 
space. 

4.3 Results in Redshift Space 

Since we have a procedure for constructing the power spectrum covariance matrix for any 
given model (equation 2.5.3), it is easy to fit the CDM models correctly to our data using 
maximum likelihood. To within a constant, the likelihood is 

$$-\ln L = \chi^2/2 + [\ln\det C]/2, \qquad (4.3.1)$$

and

$$\chi^2 = C^{-1}_{ij}\,[P - P_{\rm CDM}]_i\,[P - P_{\rm CDM}]_j, \qquad (4.3.2)$$

where $C_{ij}$ is the covariance matrix for our data.

Figure 8 shows contours of likelihood at $-\ln L = {\rm minimum} + 0.5, 1, 2, \ldots$. On
a Gaussian approximation, the 95% confidence level would be at $\Delta\ln L = 3$. This plot
shows that the maximum-likelihood model is well defined (and is an excellent description
of the data: $\chi^2 = 12$ on 18 degrees of freedom). The best-fitting parameters and their rms
uncertainties are

$$\sigma_8 = 0.88 \pm 0.07 \quad \text{(redshift space)}$$
$$\Omega h = 0.25 \pm 0.08 \quad \text{(redshift space)} \qquad (4.3.3)$$
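
The likelihood of equations (4.3.1) and (4.3.2), read as $-\ln L = \chi^2/2 + [\ln\det C]/2$ with $\chi^2 = C^{-1}_{ij}[P - P_{\rm CDM}]_i[P - P_{\rm CDM}]_j$, can be evaluated as in the sketch below. The function name and the toy inputs are assumptions made for illustration; in the analysis itself the covariance matrix would come from equation (2.5.3), which is not reproduced here. Because that matrix depends on the model, the $\ln\det C$ term must be kept when comparing models.

```python
import numpy as np


def neg_log_likelihood(p_data, p_model, cov):
    """Return -ln L = chi^2 / 2 + ln(det C) / 2, up to an additive constant.

    p_data, p_model : 1-D arrays of band powers (same length)
    cov             : covariance matrix C_ij of the band powers
    """
    resid = np.asarray(p_data, dtype=float) - np.asarray(p_model, dtype=float)
    # Solve C x = resid instead of forming the inverse explicitly.
    chi2 = resid @ np.linalg.solve(cov, resid)
    # slogdet avoids overflow or underflow in the determinant.
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * chi2 + 0.5 * logdet


# Toy example with made-up numbers: three uncorrelated band powers.
toy_data = np.array([1.0, 2.0, 2.0])
toy_model = np.zeros(3)
toy_cov = np.eye(3)
print(neg_log_likelihood(toy_data, toy_model, toy_cov))
```

Contours such as those in figure 8 would follow from evaluating this function on a grid of model parameters and subtracting the minimum.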

We may also consider the use of the CDM transfer function with a power spectrum
which is not scale-invariant. Possibilities include 'tilted' models:

$$P \propto k^n; \quad n \simeq 0.8 \qquad (4.3.4)$$

(Cen et al. 1992) or an inflationary prediction for logarithmic corrections to a scale-invariant
spectrum (Kofman, Gnedin & Bahcall 1993). The latter corrections have a
negligible effect on our results, but the use of tilted models is important. The apparent
value of $\sigma_8$ is insensitive to the assumed value of n, but the apparent redshift-space value
of $\Omega h$ changes approximately as

$$\Omega h = 0.25 + 0.29\,([1/n] - 1). \qquad (4.3.5)$$

Comparing these results with those from the Berkeley 1.2-Jy survey (Fisher et al.
1992), we find reasonable agreement. Their apparent value of $\sigma_8$ is 0.80, with a best-fitting
$\Omega h = 0.2$. These figures are well within our confidence limits. This is perhaps a
little surprising, since we have seen earlier that there appears to be a difference in power
of about a factor 1.5 between the 1.2-Jy results and the deeper QDOT results which are
used here. On this basis, we would have predicted a lower value of $\sigma_8$ for the Berkeley
data; the reason for this discrepancy is unclear.

4.4 Results in real space

Our observed power spectrum in redshift space differs in three ways from the quantity of 
interest, which is the underlying linear-theory power spectrum of mass fluctuations. This 
is altered by nonlinear evolution, by redshift-space mapping and by bias. 

The last of these is easily dealt with through ignorance: we shall assume scale-independent
bias relating the nonlinear power spectra

$$P_{\rm real} = b^2 P_{\rm mass}. \qquad (4.4.1)$$

It is clearly reasonable to treat b as a constant if the galaxy distribution is close to being 
unbiased. For a significant degree of bias, one really needs a model; empirically, if our 
conclusions conflict with other data, this could be interpreted as saying that b must depend 
on scale. 

The mapping between real and redshift space introduces two effects. On large scales, 
there is the linear increase of power described by Kaiser (1987): 



$$P(k) \rightarrow P(k)\left[1 + \frac{2\Omega^{0.6}}{3b} + \frac{\Omega^{1.2}}{5b^2}\right]. \qquad (4.4.2)$$
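
To give a feel for the size of the linear redshift-space boost, the sketch below evaluates the bracketed factor of Kaiser (1987), $1 + 2\Omega^{0.6}/(3b) + \Omega^{1.2}/(5b^2)$. The function name `kaiser_boost` and the sample parameter values are illustrative assumptions, not values fitted in this paper.

```python
def kaiser_boost(omega, b):
    """Angle-averaged linear redshift-space boost of the power spectrum.

    omega : density parameter
    b     : linear bias parameter
    """
    beta = omega**0.6 / b
    return 1.0 + 2.0 * beta / 3.0 + beta**2 / 5.0


# Illustrative cases: an unbiased Omega = 1 universe and a low-density one.
for omega, b in [(1.0, 1.0), (0.1, 1.0)]:
    print(f"Omega = {omega}, b = {b}: boost = {kaiser_boost(omega, b):.3f}")
```

For $\Omega = b = 1$ the factor is 28/15, so the power is nearly doubled, whereas for low $\Omega$ the factor approaches unity.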



On small scales, there is the filtering effect of virialized peculiar velocities to deal with; 
measured redshifts will also be of limited precision. These effects can be treated in the 
same way: consider the simplest possible case in which these effective errors constitute a 
Gaussian scatter characterized by some spatial rms $\sigma$. In azimuthal average, this gives



$$P(k) \rightarrow P(k)\,\frac{\sqrt{\pi}}{2}\,\frac{{\rm erf}(k\sigma)}{k\sigma} \qquad (4.4.3)$$

(Peacock 1992); modes at high k are thus only damped by one power of k, rather than
exponentially. Fisher et al. (1992) show that the combination of these factors describes
the relation between power spectra in real and redshift space quite accurately. Small-scale
velocities of about 200 km s$^{-1}$ rms are observationally appropriate, and QDOT redshifts
have a typical error of 300 km s$^{-1}$, making a total effective $\sigma$ of $4.4\,h^{-1}\,{\rm Mpc}$. At the largest
wavenumber considered here ($k = 0.2\,h\,{\rm Mpc}^{-1}$), the corresponding correction factor to the
power is 0.85. This latter correction is unavoidable, since it is based largely on uncertainties
in the data. This effect alone alters the best-fitting values of the spectral parameters to
$\sigma_8 = 0.92 \pm 0.07$ and $\Omega h = 0.28 \pm 0.08$.

The nonlinear distortions of the spectrum can be dealt with analytically (assuming
$\Omega = 1$) by using the remarkable formulae given by Hamilton et al. (1991). The result
for CDM-like spectra is that power is removed from wavenumbers $k \sim 0.1\,h\,{\rm Mpc}^{-1}$, which
lowers the apparent value of $\Omega h$.

The result of applying all these corrections can be expressed in terms of the change
in the best-fitting values of $\Omega h$ and $\sigma_8$ as a function of b:

$$b\sigma_8 \simeq 0.92 - 0.18/b^{0.8} \quad \text{(recovered linear value)}$$
$$\Omega h \simeq 0.28 + 0.05/b^{1.3} \quad \text{(recovered linear value)} \qquad (4.4.4)$$

Consistency with the value $\sigma_8 = 0.57$ deduced by White, Efstathiou & Frenk (1993)
requires b = 1.37 and $\Omega h = 0.31 \pm 0.08$. This last figure is in remarkable agreement with
the $\Omega h = 0.32 \pm 0.07$ deduced from the cluster-galaxy cross-correlation function by Mo,
Peacock & Xia (1993).
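
The step from equation (4.4.4) to the quoted bias can be checked numerically. The sketch below assumes that (4.4.4) reads $b\sigma_8 \simeq 0.92 - 0.18/b^{0.8}$ and $\Omega h \simeq 0.28 + 0.05/b^{1.3}$, and solves for the b at which the mass normalization equals $\sigma_8 = 0.57$ by simple bisection. The function names and the bracketing interval are illustrative choices, not part of the original analysis.

```python
def recovered_linear_values(b):
    """Recovered linear (b * sigma_8, Omega h) as functions of the bias b."""
    b_sigma8 = 0.92 - 0.18 / b**0.8
    omega_h = 0.28 + 0.05 / b**1.3
    return b_sigma8, omega_h


def solve_bias(sigma8_mass, lo=1.0, hi=3.0, tol=1e-10):
    """Bisection for the b satisfying b * sigma8_mass = (b sigma_8)(b)."""

    def mismatch(b):
        return b * sigma8_mass - recovered_linear_values(b)[0]

    if mismatch(lo) * mismatch(hi) > 0:
        raise ValueError("bracket does not contain a sign change")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mismatch(lo) * mismatch(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


b = solve_bias(0.57)
omega_h = recovered_linear_values(b)[1]
print(f"b = {b:.2f}, Omega h = {omega_h:.2f}")
```

Under these assumptions the solution reproduces the values b = 1.37 and $\Omega h = 0.31$ quoted above.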

If the density parameter is low, the linear amplification of power in redshift space
does not occur, so that the nonlinear real-space value of $b\sigma_8$ will be higher: close to the
observed redshift-space value of 0.92. The formulae of Hamilton et al. (1991) do not apply
to the low-density case, so we cannot say exactly what the effects of nonlinearities will
be. However, experience with N-body simulations suggests that the recovered linear values
of $\sigma_8$ and $\Omega h$ will still be altered by an amount of the order 0.1 (for b = 1), as in the
$\Omega = 1$ case.

5. TESTS FOR NON-GAUSSIANITY

The expectation value of the power that we have tried to estimate contains in itself no 
information about phase correlations or higher than two-point correlations. As we have 
seen however, the estimated power will inevitably fluctuate strongly from point to point 
in k-space, and these fluctuations about the mean power are rich in information about
higher-order correlations.

One probe of non-Gaussianity is from the 1-point (in k-space that is) distribution
of the power. According to the Gaussian hypothesis, the power should be exponentially
distributed:

$$p(>P) = \exp(-P/\bar{P}) \qquad (5.1)$$

(Kendall & Stewart 1977) and this has been exploited as a way to quantify evidence for 
periodicity in pencil-beam surveys for instance (Szalay et al. 1990). The pencil-beam 
analysis did indeed appear to show some evidence for non-Gaussianity: the distribution 
of the power was enhanced at high values of P. However, it can plausibly be argued that 
what is happening here is that the distribution of power was calculated over a wide range of 
wave-numbers and that the high frequency components were suppressed by smoothing by 
redshift errors and by random motions (Kaiser & Peacock 1991). Thus what one sees in the 
pencil-beam case is a blend of exponential distributions with different length scales. One 
way to resolve this would be to restrict the range to low frequencies only, but unfortunately 
then the number of independent 'coherence cells' is rather small (even though the baseline 
for this sample is impressively long). 

Because we work in three dimensions, the present work yields a much larger number
of independent estimates of the low-frequency power. The distribution of the power for
k < 0.1 (wavelengths $> 60\,h^{-1}\,{\rm Mpc}$) is shown in figure 9. The agreement with the exponential
prediction is remarkably good. It should be kept in mind that what we are seeing here
is the distribution of the 'raw' power which contains a superposition of the shot noise
power, whose long wavelength Fourier components are essentially guaranteed to be Gaussian
distributed by virtue of the central limit theorem. For the wavelengths employed in figure
9, the real long-wavelength power is roughly equal to the shot noise power. The fact
that the exponential law nevertheless holds out to $P/\bar{P} \sim 10$ is thus an extremely
exacting test of the Gaussian hypothesis, and should provide an important constraint
on non-Gaussian models such as those based on topological defects.
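
The exponential law of equation (5.1) is easy to illustrate with synthetic data. The sketch below draws independent complex Gaussian Fourier amplitudes (an idealization: it ignores the survey window and the correlations between neighbouring modes) and compares the fraction of modes with power above $x\bar{P}$ with $\exp(-x)$. The sample size and random seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n_modes = 200_000

# Complex Gaussian amplitudes: independent real and imaginary parts.
amplitudes = rng.normal(size=n_modes) + 1j * rng.normal(size=n_modes)
power = np.abs(amplitudes) ** 2
mean_power = power.mean()


def survival_fraction(x):
    """Fraction of modes whose power exceeds x times the mean power."""
    return np.mean(power > x * mean_power)


for x in (1.0, 3.0, 5.0):
    print(f"P/Pbar > {x}: measured {survival_fraction(x):.4f}, "
          f"exponential law {np.exp(-x):.4f}")
```

A non-Gaussian field would generally show up as a departure from the exponential law, most visibly in the high-power tail.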

There are potentially many other statistics that one could construct based on the
Fourier components which would measure interesting high order correlations. One particularly
interesting one is the two-point function of the power $\chi(\delta k) = \langle P(k)P(k + \delta k)\rangle$.
As we saw in §2.4, if we assume Gaussian fluctuations then $\chi(\delta k)$ is simply determined
from the geometry of the survey (it is essentially the Fourier transform of the survey volume)
and will therefore have width in $\delta k$ on the order of 1/D. It is not difficult to see,
however, how this prediction might be modified for certain rather interesting non-Gaussian
models. Consider, for instance, a density field which is the product of a Gaussian field
with a 'modulating' field which has only long wavelength components. In such a model,
the density field would look locally Gaussian, but seen on a larger scale one would have
patches of greater or lesser amplitude. The micro-scale fluctuations in P(k) in such a
model would be the same as in a pure Gaussian model but with a patchy survey volume;
i.e. the two point function would be more extended than predicted by equation (2.4.6).
The excess width of the $\chi$ statistic therefore measures the degree to which fluctuations of
spatial frequency k are being modulated by frequencies $\sim \delta k$.

6. CONCLUSIONS 

We have presented a formalism for power-spectrum analysis of fully three-dimensional 
deep redshift surveys. Our main new result is an analytical estimation of the statistical 
uncertainties in the power (including both sampling and galaxy counting statistics). We 
have also presented a rigorous analytical formulation of the optimal weight function for 
the data, assuming that the long-wavelength Fourier components are Gaussian distributed. 
Assuming that the power is smooth, we have shown how to derive the full covariance matrix 
for the power spectrum. This provides all the information necessary for a proper statistical 
comparison between power-spectrum data and theory. 

We have applied the method to the updated 1-in-6 QDOT IRAS redshift survey.
We find that survey depths in excess of $100\,h^{-1}\,{\rm Mpc}$ are necessary in order to obtain a
stable estimate of the power spectrum. Our results strengthen and quantify the impression
that there is extra power on large scales as compared to the standard CDM model with
$\Omega h \simeq 0.5$. Nevertheless, there appears to be a break in the power spectrum at wavelengths
$\lambda \sim 150$-$200\,h^{-1}\,{\rm Mpc}$, with sharply reduced power for larger wavelengths. This is
consistent with the picture emerging from a number of other studies. 

We have applied likelihood analysis using the BBKS approximation of the CDM
spectrum with $\Omega h$ as a free parameter as a phenomenological family of models: in redshift
space the best-fitting parameters are $\sigma_8 = 0.88 \pm 0.07$, $\Omega h = 0.25 \pm 0.08$. We have
attempted to treat the distortions to the power spectrum introduced by nonlinear evolution
and the redshift-space mapping, and so recover the parameters which describe the linear
power spectrum. If the linear rms variance is taken to agree with White et al. (1993)
($\sigma_8 = 0.57$), we find that a linear power spectrum with $\Omega h = 0.31 \pm 0.08$ is implied,
in excellent agreement with the figure deduced from the cluster-galaxy cross-correlation
function by Mo et al. (1993).

We have calculated the distribution of the estimated long-wavelength power, and 
searched for signs of non-Gaussianity in the 1-point (in k-space) distribution of the power.
We found no trace of non-Gaussian behavior; rather, the distribution agreed exceptionally 
well with the exponential distribution expected for Gaussian fluctuations and we found 
no sign of periodicity or any particularly strong spatial frequencies. What is needed now 
is a well motivated non-Gaussian model with which to compare this strong observational 
constraint. 

Figure Captions

Fig. 1) The IRAS QDOT galaxy distribution (open circles) in the range
$20\,h^{-1}\,{\rm Mpc} < R < 500\,h^{-1}\,{\rm Mpc}$ and the angular mask (solid squares). Each masked bin is 1° x 1°.

Fig. 2) A histogram of the IRAS QDOT survey and a $\chi^2$ fit of Eqn. 2.5.2. (solid line) used
as the radial selection function that defines the synthetic catalogue and the mean 
space density of the galaxies. 

Fig. 3) The optimal weight function parameterized by the power P(k). By varying the 
assumed power over values that cover the observed power, we in effect produce 
different effective survey depths. 

Fig. 4) The power spectrum for the full IRAS QDOT survey using four weight function 
parameterizations. We see that the larger the power in the weight function is (i.e. 
the greater the effective depth of the survey) the more power we get. The effect 
is most marked at the shallow end: we gain roughly a factor 1.5 in power when 
the assumed power changes from 2000 $(h^{-1}\,{\rm Mpc})^3$ to 4000 $(h^{-1}\,{\rm Mpc})^3$, but things
change relatively little thereafter. This suggests that the effect is a local one that 
represents true sampling fluctuations, but that the correct average power is detected 
for greater depths. 

Fig. 5) A comparison of the IRAS QDOT survey with $P(k) = 2000\ (h^{-1}\,{\rm Mpc})^3$ in the
weight function with the 1.2-Jy IRAS survey. Comparing with figure 3, we see that 
our shallowest weight gives roughly constant weighting to volumes corresponding 
to the 1.2-Jy survey, which is why we have presented the comparison in this way. 
Once scaled to the same depth, the surveys agree reasonably well. The 1.2-Jy data 
lie slightly below our shallow results; however, there would be a much more marked 
discrepancy if we had chosen to compare to the 'standard' $P = 8000\ (h^{-1}\,{\rm Mpc})^3$
QDOT results: about a factor of 1.5 in power.

Fig. 6) A comparison of the IRAS QDOT survey with $P(k) = 16000\ (h^{-1}\,{\rm Mpc})^3$ in the weight
function with the CfA surveys. As with QDOT, the CfA results appear to show 
some weak trend for the power to increase with increasing sample depth. The shapes 
of the CfA and QDOT power spectra appear to differ: similar levels of power are 
seen at large wavelength, but the CfA spectra show much more small-scale power. 
However, note the relative sizes of the large-wavelength error bars in the CfA data, 
plus the fact that their analysis gives no idea of the extent of any cross-correlation 
between different points. 

Fig. 7) A comparison of the IRAS QDOT survey with $P(k) = 8000\ (h^{-1}\,{\rm Mpc})^3$ in the
weight function with some theoretical models normalized at $k = 0.1\,h\,{\rm Mpc}^{-1}$. 1)
For the BBKS linear CDM model with $\Omega h = 0.5$, the poor fit is apparent. Whether
one regards this as suggesting an excess of large-scale power or a lack of small-scale
power is a matter both of taste and of where we choose to normalize. 2) The
APM fitting function gives a reasonable fit to the IRAS QDOT power spectrum.
The overall shape agrees quite well with the APM data; however, the position of
the break that is required to best-fit the APM data appears to be excluded by our
results. 3) MDM simulations agree quite well with the IRAS QDOT data. The
MDM scenario gives results that are rather similar to those of CDM models with
low $\Omega h$ or 'tilted' spectra. 4) $P(k) \propto k^{-1.4}$ gives good agreement for
$k > 0.04\,h\,{\rm Mpc}^{-1}$.

Fig. 8) Likelihood contours at $-\ln L = {\rm minimum} + 0.5, 1, 2, \ldots$. The 95% confidence level
would be at $\Delta\ln L = 3$. This plot shows that the maximum-likelihood model is
well defined (and is an excellent description of the data: $\chi^2 = 12$ on 18 degrees of
freedom).

Fig. 9) The distribution of the power for k < 0.1 ($\lambda > 60\,h^{-1}\,{\rm Mpc}$). The agreement with
the exponential prediction is remarkable. It should be kept in mind that what we
are seeing here is the distribution of the 'raw' power which contains a superposition
of the shot noise power, whose long wavelength Fourier components are essentially
guaranteed to be Gaussian distributed by virtue of the central limit theorem. For
the wavelengths employed here, the real long-wavelength power is roughly equal
to the shot noise power. The fact that the exponential law nevertheless holds
out to $P/\bar{P} \sim 10$ is thus an extremely exacting test of the Gaussian hypothesis,
and should provide an important constraint on non-Gaussian models such as those
based on topological defects.

APPENDIX A 



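Two-Point Function For a Poisson-Sample Point Process

If we have a point process $n({\bf r})$ which is a "Poisson sample" of some continuous stochastic
field $1 + f({\bf r})$ with a given mean density of points $\bar{n}({\bf r})$ (i.e. the probability that an
infinitesimal volume element $\delta V$ contains an object is $\bar{n}({\bf r})[1 + f({\bf r})]\delta V$) then the two-point
function of $n({\bf r})$ is

$$\langle n({\bf r})n({\bf r}')\rangle = \bar{n}({\bf r})\bar{n}({\bf r}')\,[1 + \xi({\bf r} - {\bf r}')] + \bar{n}({\bf r})\,\delta({\bf r} - {\bf r}') \qquad (A1)$$

where $\xi({\bf r}) = \langle f({\bf r}')f({\bf r}' + {\bf r})\rangle$.

To prove this, consider the expectation value of

$$\int d^3r \int d^3r'\, g({\bf r},{\bf r}')\,n({\bf r})n({\bf r}') \qquad (A2)$$

where $g({\bf r}, {\bf r}')$ is an arbitrary function. Using the standard procedure (Peebles 1980, §36)
of converting such integrals to sums over infinitesimal microcells with occupation numbers
$n_i = 0, 1$ and using $\langle n_i n_j\rangle = \bar{n}({\bf r}_i)\bar{n}({\bf r}_j)\,\delta V^2\,[1 + \xi({\bf r}_i - {\bf r}_j)]$ if $i \neq j$ and
$\langle n_i^2\rangle = \langle n_i\rangle = \bar{n}({\bf r}_i)\,\delta V$ we find

$$\begin{aligned}
\left\langle \int d^3r \int d^3r'\, g({\bf r},{\bf r}')\,n({\bf r})n({\bf r}') \right\rangle
&= \int d^3r \int d^3r'\, g({\bf r},{\bf r}')\,\langle n({\bf r})n({\bf r}')\rangle \\
&= \sum_i \sum_j g({\bf r}_i,{\bf r}_j)\,\langle n_i n_j\rangle \\
&= \sum_i \sum_{j \neq i} g({\bf r}_i,{\bf r}_j)\,\bar{n}({\bf r}_i)\bar{n}({\bf r}_j)\,[1 + \xi({\bf r}_i - {\bf r}_j)]\,\delta V^2 + \sum_i g({\bf r}_i,{\bf r}_i)\,\bar{n}({\bf r}_i)\,\delta V \\
&= \int d^3r \int d^3r'\, g({\bf r},{\bf r}')\,\bar{n}({\bf r})\bar{n}({\bf r}')\,[1 + \xi({\bf r} - {\bf r}')] + \int d^3r\, g({\bf r},{\bf r})\,\bar{n}({\bf r}) \\
&= \int d^3r \int d^3r'\, g({\bf r},{\bf r}')\left\{\bar{n}({\bf r})\bar{n}({\bf r}')\,[1 + \xi({\bf r} - {\bf r}')] + \bar{n}({\bf r})\,\delta({\bf r} - {\bf r}')\right\} \qquad (A3)
\end{aligned}$$

Since this must be true for an arbitrary function $g({\bf r}, {\bf r}')$ then comparing the first and last
lines of (A3) we obtain the identity (A1).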
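
The delta-function (shot-noise) term in (A1) can be checked numerically in the simplest special case f = 0, i.e. an unclustered Poisson sample with $\xi = 0$. Integrating (A1) over a single cell with mean count $\bar{N}$ then gives $\langle N^2\rangle = \bar{N}^2 + \bar{N}$, while for two disjoint cells $\langle N_1 N_2\rangle = \bar{N}^2$. The sketch below is an illustration under that assumption only; the mean count of 5, the number of realizations and the seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)

mean_count = 5.0          # N-bar: mean number of objects per cell
n_realizations = 100_000

# Counts in two disjoint cells for each realization (f = 0, so xi = 0).
counts = rng.poisson(mean_count, size=(n_realizations, 2)).astype(float)

same_cell = np.mean(counts[:, 0] ** 2)           # expect N-bar^2 + N-bar
diff_cell = np.mean(counts[:, 0] * counts[:, 1])  # expect N-bar^2

print(f"<N^2>    = {same_cell:.2f}  (prediction {mean_count**2 + mean_count:.2f})")
print(f"<N1 N2>  = {diff_cell:.2f}  (prediction {mean_count**2:.2f})")
```

The excess of $\langle N^2\rangle$ over $\bar{N}^2$ is the discreteness term; a clustered field would add the integral of $\xi$ to both predictions.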

APPENDIX B



Fluctuations in power for a Gaussian field 

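We need to evaluate the two point function of fluctuations in the power $\langle\delta P({\bf k})\,\delta P({\bf k}')\rangle$
[where $\delta P({\bf k}) = P({\bf k}) - \langle P({\bf k})\rangle$]:

$$\langle\delta P({\bf k})\,\delta P({\bf k}')\rangle = \langle P({\bf k})P({\bf k}')\rangle - \langle P({\bf k})\rangle\langle P({\bf k}')\rangle. \qquad (B1)$$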

To make progress, we shall assume that the large-wavelength portion of the power spec- 
trum describes a Gaussian field. A possible way of proceeding would then be to consider 
separately the real and imaginary parts of the Fourier field $F({\bf k}) = c_{\bf k} + i\,s_{\bf k}$, write down
all relevant correlations ($\langle c_{\bf k} s_{{\bf k}'}\rangle$ etc.) and use the general relation for a bivariate Gaussian

$$\langle x^2y^2\rangle = \langle x^2\rangle\langle y^2\rangle + 2\langle xy\rangle^2. \qquad (B2)$$

A less cumbersome method is to appeal to the idea that realizations of Gaussian processes 
in k space can be obtained by Fourier transforming a set of independent Gaussian random 
variables in real space: 

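$$F({\bf k}) = \sum_i g_i\,e^{i{\bf k}\cdot{\bf r}_i}. \qquad (B3)$$

In these terms, the power $\langle|F({\bf k})|^2\rangle = \sum_i \langle g_i^2\rangle$ and the two-point function in k space is
$\langle F({\bf k})F^*({\bf k}')\rangle = \sum_i \langle g_i^2\rangle\,e^{i({\bf k}-{\bf k}')\cdot{\bf r}_i}$. If we now write down the two-point function of the
power in terms of this expansion, we obtain a fourfold product $\langle g_i g_j g_k g_\ell\rangle$, which only gives
a nonzero result when two pairs of indices are equal, something that can happen in four
distinct ways. The two-point function for the power is then

$$\langle P({\bf k})P({\bf k}')\rangle = \sum_i \langle g_i^4\rangle + \sum_i \sum_{j \neq i} \langle g_i^2\rangle\langle g_j^2\rangle\left[1 + e^{i({\bf k}+{\bf k}')\cdot({\bf r}_i-{\bf r}_j)} + e^{i({\bf k}-{\bf k}')\cdot({\bf r}_i-{\bf r}_j)}\right]. \qquad (B4)$$

Since in the Gaussian case $\langle g_i^4\rangle = 3\langle g_i^2\rangle^2$, this neatly allows the double sum to be made
unrestricted. If the term involving ${\bf k}+{\bf k}'$ is ignored on the grounds that its rapid oscillations
will give a result negligible by comparison with the one involving ${\bf k}-{\bf k}'$, then we obtain
the desired result:

$$\langle P({\bf k})P({\bf k}')\rangle = \langle P({\bf k})\rangle\langle P({\bf k}')\rangle + |\langle F({\bf k})F^*({\bf k}')\rangle|^2. \qquad (B5)$$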
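
The result (B5), read as $\langle P({\bf k})P({\bf k}')\rangle = \langle P({\bf k})\rangle\langle P({\bf k}')\rangle + |\langle F({\bf k})F^*({\bf k}')\rangle|^2$, can be tested with a small Monte Carlo experiment. The sketch below is a one-dimensional toy version of the construction in (B3): independent Gaussian variables $g_i$ at positions $r_i$, with variances $\langle g_i^2\rangle$ following a Gaussian window that stands in for a survey selection function. The window shape, the two wavenumbers, the number of realizations and the seed are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

r = np.arange(64.0)                                # positions r_i (1-D toy model)
var = np.exp(-0.5 * ((r - 32.0) / 10.0) ** 2)      # <g_i^2>: Gaussian window
k1 = 2.0 * np.pi * 10.0 / 64.0                     # two nearby wavenumbers
k2 = 2.0 * np.pi * 10.5 / 64.0

# Realizations of F(k) = sum_i g_i exp(i k r_i) with independent Gaussian g_i.
g = rng.normal(size=(20000, r.size)) * np.sqrt(var)
F1 = g @ np.exp(1j * k1 * r)
F2 = g @ np.exp(1j * k2 * r)
P1, P2 = np.abs(F1) ** 2, np.abs(F2) ** 2

# Left-hand side of (B5): Monte Carlo estimate of <P(k) P(k')>.
lhs = np.mean(P1 * P2)

# Right-hand side of (B5) from the analytic expressions.
mean_power = var.sum()                             # <P> = sum_i <g_i^2>
cross = np.sum(var * np.exp(1j * (k1 - k2) * r))   # <F(k) F*(k')>
rhs = mean_power**2 + np.abs(cross) ** 2

print(f"Monte Carlo <P P'> = {lhs:.1f}, prediction from (B5) = {rhs:.1f}")
```

Here k + k' is far from zero compared with the inverse width of the window, so the neglected term is tiny; moving the two wavenumbers further apart makes the cross term fall off and the powers decorrelate.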